** Data reading and variable selection from raw data
** Family Survey of the Dutch Population (Family-enquete Nederlandse Bevolking) 1992


** 01. Reading data **
/*The master file has been merged with the adult file using the following code:
cd /*insert your working directory here*/
use ".\p1245g.dta"
keep respnr age v40_a v59_f v59_s v59_d v59_e
sort respnr
save temp1.dta,replace

use p1245c.dta, clear
sort respnr
save temp2.dta,replace

use temp1.dta
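// note: m:m merges are rarely appropriate; if respnr is unique in both files, merge 1:1 is the safer choice
merge m:m respnr using temp2.dta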
keep if _merge==3
drop _merge
save p1245.dta,replace*/

cap log close
clear all
set more off
cd /*insert your working directory here*/
use ".\p1245.dta"
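
// Optional check (a sketch, not part of the original workflow): the merge that
// produced p1245.dta was run as m:m on respnr. If respnr is meant to identify
// respondents uniquely (an assumption to confirm in the codebook), the report
// below should show no surplus copies.
duplicates report respnr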


** 02. Constructing year and country variables **

ge year=1992
lab var year "survey year"

ge country=528
lab var country "ISO country code"
//Netherlands: 528 (see "ISO Country Codes.pdf")


** 03. ID variables **

ge pid=respnr
lab var pid "respondent id"


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=v38
lab def sex 1 "male" 2 "female"
lab val sex sex
lab var sex "respondent's sex"

tab age

ge birthyr=1992-age
lab var birthyr "respondent's year of birth"


** 05. Siblings **
ge nbro=v33_z
lab var nbro "number of brothers"
ge nsis=v33_d
lab var nsis "number of sisters"

ge nsibs=nbro+nsis
lab var nsibs "number of siblings"
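
// Optional check (a sketch, not part of the original recode): nsibs is missing
// whenever nbro or nsis is missing, and it would be distorted if the raw counts
// carried negative codes such as the -1 "no answer" used for the industry variable.
// Whether v33_z and v33_d use such codes is an assumption to verify in the codebook.
count if nbro<0 | nsis<0
count if missing(nbro, nsis)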

//birth order not available


** 06. Own education **

//level of education
ge educ=v40_a
lab var educ "level of education of the respondent"

lab def educ 1 "less than elementary school" 2 "primary school" 3 "low sec vocational" 4 "low sec academic" ///
5 "mid sec vocational" 6 "mid sec academic" 7 "high sec academic" 8 "tert vocational" 9 "university" 10 "postgraduate"

lab val educ educ


** 07. Parents' education: Father and/or Mother **

ge faeduc=v4_v
lab var faeduc "father's highest education obtained"
ge moeduc=v4_m
lab var moeduc "mother's highest education obtained"
lab val faeduc moeduc educ


** 08. Own occupation **

rename v59_f occ
lab var occ "respondent's last/current occupation"

ge ind=v59_s
lab var ind "respondent's last/current industry code"
lab def ind -1 "no answer"
lab val ind ind

rename v59_d empstat
lab var empstat "respondent's last/current employment status"

ge subord=v59_e
lab var subord "respondent's last/current subordinates"
lab def subord -1 "no answer" 1 "no subordinates" 2 "1-2 subordinates" 3 "3-10 subordinates" 4 "11-24 subordinates" 5 "25 or more subordinates"
lab val subord subord

** 09. Parents' occupation **

//father
rename v6_f faocc15
lab var faocc15 "father's occupation when the respondent was 15"
rename v6_s faind15
lab var faind15 "father's industry when the respondent was 15"
lab val faind15 ind

rename v12 faocc
lab var faocc "father's current/last occupation"
rename v12_s faind
lab var faind "father's current/last industry"
lab val faind ind

//mother
rename v17_f maocc15
lab var maocc15 "mother's occupation when the respondent was 15"
rename v17_s maind15
lab var maind15 "mother's industry when the respondent was 15"
lab val maind15 ind

rename v22 maocc
lab var maocc "mother's current/last occupation"
rename v22_s maind
lab var maind "mother's current/last industry"
lab val maind ind


** 10. Tabulate the Identified Variables **

log using /*insert your working directory here*/, replace text

** Data reading and variable selection from raw data
** Family Survey of the Dutch Population (Family-enquete Nederlandse Bevolking) 1992

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis, d

** R's Own Education **
tab1 educ 

** Parental Education **
tab1 faeduc moeduc 

** R's Own Occupation **
tab1 occ ind empstat subord

** Parental Occupation **
tab1 faocc15 faind15 faocc faind maocc15 maind15 maocc maind


log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nsibs nbro nsis ///
	 educ faeduc moeduc ///
	 occ ind empstat subord ///
	 faocc15 faind15 faocc faind maocc15 maind15 maocc maind


** 12. Save the Data File **

saveold /*insert your working directory here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=6 if educ_cat==1
replace educ_yrs=6 if educ_cat==2
replace educ_yrs=9.5 if educ_cat==3
replace educ_yrs=10 if educ_cat==4
replace educ_yrs=11 if educ_cat==5
replace educ_yrs=11 if educ_cat==6
replace educ_yrs=12 if educ_cat==7
replace educ_yrs=15 if educ_cat==8
replace educ_yrs=17 if educ_cat==9
replace educ_yrs=21 if educ_cat==10
lab var educ_yrs "respondent's highest education in years"
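
// Optional cross-check (a sketch, not the original method): the same mapping
// written as a single recode, compared against educ_yrs and then dropped.
// educ_yrs_chk is a hypothetical helper name used only for this check.
recode educ_cat (1 2 = 6) (3 = 9.5) (4 = 10) (5 6 = 11) (7 = 12) ///
	(8 = 15) (9 = 17) (10 = 21) (else = .), gen(educ_yrs_chk)
assert educ_yrs_chk==educ_yrs
drop educ_yrs_chk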

ge educ_ISCED=020 if educ_cat==1
replace educ_ISCED=100 if educ_cat==2
replace educ_ISCED=254 if educ_cat==3
replace educ_ISCED=244 if educ_cat==4
replace educ_ISCED=344 if educ_cat==5
replace educ_ISCED=344 if educ_cat==6
replace educ_ISCED=354 if educ_cat==7
replace educ_ISCED=655 if educ_cat==8
replace educ_ISCED=600 if educ_cat==9
replace educ_ISCED=747 if educ_cat==10
lab var educ_ISCED "respondent's highest education in ISCED code"
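
// Optional check (a sketch): cross-tabulate the original categories against the
// ISCED codes to confirm that each category maps to exactly one code and that
// unmapped values stay missing.
tab educ_cat educ_ISCED, missing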

** Parents Education **
//flag=1: the father's education variable refers to the father himself
ge faeduc_flag=1 

rename faeduc faeduc_cat
rename moeduc maeduc_cat

ge faeduc_yrs=6 if faeduc_cat==1
replace faeduc_yrs=6 if faeduc_cat==2
replace faeduc_yrs=9.5 if faeduc_cat==3
replace faeduc_yrs=10 if faeduc_cat==4
replace faeduc_yrs=11 if faeduc_cat==5
replace faeduc_yrs=11 if faeduc_cat==6
replace faeduc_yrs=12 if faeduc_cat==7
replace faeduc_yrs=15 if faeduc_cat==8
replace faeduc_yrs=17 if faeduc_cat==9
replace faeduc_yrs=21 if faeduc_cat==10
lab var faeduc_yrs "father's education in years"

ge maeduc_yrs=6 if maeduc_cat==1
replace maeduc_yrs=6 if maeduc_cat==2
replace maeduc_yrs=9.5 if maeduc_cat==3
replace maeduc_yrs=10 if maeduc_cat==4
replace maeduc_yrs=11 if maeduc_cat==5
replace maeduc_yrs=11 if maeduc_cat==6
replace maeduc_yrs=12 if maeduc_cat==7
replace maeduc_yrs=15 if maeduc_cat==8
replace maeduc_yrs=17 if maeduc_cat==9
replace maeduc_yrs=21 if maeduc_cat==10
lab var maeduc_yrs "mother's education in years"

ge faeduc_ISCED=020 if faeduc_cat==1
replace faeduc_ISCED=100 if faeduc_cat==2
replace faeduc_ISCED=254 if faeduc_cat==3
replace faeduc_ISCED=244 if faeduc_cat==4
replace faeduc_ISCED=344 if faeduc_cat==5
replace faeduc_ISCED=344 if faeduc_cat==6
replace faeduc_ISCED=354 if faeduc_cat==7
replace faeduc_ISCED=655 if faeduc_cat==8
replace faeduc_ISCED=600 if faeduc_cat==9
replace faeduc_ISCED=747 if faeduc_cat==10
replace faeduc_ISCED=. if faeduc_cat==11
lab var faeduc_ISCED "father's highest education in ISCED code"

ge maeduc_ISCED=020 if maeduc_cat==1
replace maeduc_ISCED=100 if maeduc_cat==2
replace maeduc_ISCED=254 if maeduc_cat==3
replace maeduc_ISCED=244 if maeduc_cat==4
replace maeduc_ISCED=344 if maeduc_cat==5
replace maeduc_ISCED=344 if maeduc_cat==6
replace maeduc_ISCED=354 if maeduc_cat==7
replace maeduc_ISCED=655 if maeduc_cat==8
replace maeduc_ISCED=600 if maeduc_cat==9
replace maeduc_ISCED=747 if maeduc_cat==10
replace maeduc_ISCED=. if maeduc_cat==11
lab var maeduc_ISCED "mother's highest education in ISCED code"
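
// Optional check (a sketch): category 11 is set to missing in the ISCED recodes
// above but has no entry in the educ value label, and no years value is assigned.
// What code 11 stands for is not documented here; the count shows how many
// parents are affected.
count if faeduc_cat==11 | maeduc_cat==11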

** 14. Homogenising sibling **
//cutoff
ge nsibs_flag=99
lab var nsibs_flag "cutoff of total number of siblings"
ge nsis_flag=99
lab var nsis_flag "cutoff of number of sisters"
ge nbro_flag=99
lab var nbro_flag "cutoff of number of brothers"

lab def nsib_flag 99 "no cutoff"
lab val nsis_flag nbro_flag nsibs_flag nsib_flag


** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nsis nbro nsibs nsis_flag nbro_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert your working directory here*/, replace
